Abstract 

A model of cellular metabolism due to S. Kauffman is analyzed. It 
consists of a network of Boolean gates randomly assembled according to 
a probability distribution. It is shown that the behavior of the network 
depends very critically on certain simple algebraic parameters of the dis- 
tribution. In some cases, the analytic results support conclusions based 
on simulations of random Boolean networks, but in other cases, they do 
not. 

1 Introduction 

Many dynamical systems are modelled by networks of interacting elements. 
Examples come from diverse areas of science and engineering and over enormous 
scales of time and space, from biochemical networks within a cell to food webs 
and collaboration networks in human organizations. Often, these systems 
are subjected to random or unpredictable processes. In this article, we are 
concerned with a class of random networks that S. Kauffman proposed as 
models of cellular metabolism. These are networks of Boolean gates, where each 
gate corresponds to a gene or protein, and the network describes the interactions 
among these chemical compounds. Although Boolean networks capture at least 
some of the salient features of the operation of the genome, researchers have been 
mainly interested in certain abstract properties of their dynamics. Kauffman's 
thesis is that randomly assembled complex systems often exhibit "spontaneous 
order," that is, even though they are not constructed according to any plan, 
their behavior is often stable and robust. 

Kauffman considered several measures of order, based on the limit cycle 
that the network enters. Since a Boolean network has a finite number of gates, 
each of which has two possible states, the network itself has a finite number of 
states, and it will eventually return to some state it had visited earlier. Since 
the network operates deterministically, it will keep repeating this sequence of 
states, which is called the limit cycle. Among the measures of order that have 
been considered are 

1. The number of stable gates — gates that eventually stop changing state. 

2. The number of weak gates — gates that can be perturbed without changing 
the limit cycle that the network enters. 

3. The size of the limit cycle. 

The key findings of Kauffman's experiments were that networks constructed 
from Boolean gates with more than two inputs were usually disordered in all 
three senses: a significant fraction of the gates never stabilized and, when per- 
turbed, caused the network to enter a different limit cycle, and the size of the 
limit cycle was exponential in the number of gates. But networks constructed 
from gates with two inputs tended to be ordered in all three senses, in particular, 
the average limit cycle size was on the order of the square root of the number 
of gates. 

These results raise many biological and mathematical questions. From the 
viewpoint of biology, a basic issue is whether these Boolean networks capture the 
essential features of cellular metabolism. Genes are generally active or inactive, 
i.e., transcribing their protein or not, and the transition between the two states 
usually happens on a short time scale. Each gene tends to be directly affected 
by a small number of proteins. Thus the Boolean network model seems to be 
at least a rough approximation of cellular metabolic networks. Also, genomes 
are the result of evolution, which involves random events. However, it would be 
extremely unlikely that the simple probability distributions used by Kauffman 
are realistic. He studied two kinds of random networks constructed from 2-input 
gates. In the first kind, all of the 16 Boolean functions of two arguments are 
equally likely to be assigned to a gate. This is certainly a reasonable place to 
start, given the lack of knowledge about the actual distribution of functions 
in real genomic networks. Two of these 16 functions are constants, i.e., they 
ignore their inputs and output only one value. Such gates exhibit an extreme 
form of order, and it seemed possible that their presence was the source of 
order in networks of 2-input gates. However, Kauffman also ran simulations of 
randomly constructed networks without constant gates, where the remaining 14 
two argument functions were equally likely, and the results were similar to those 
where all 16 functions were used. 

Kauffman proposed another category of functions as the source of order. 
He called these the canalyzing functions. A canalyzing function is a Boolean 
function for which there exists some argument and some Boolean value such 
that the output of the function is determined if the argument has that value. 
For example, the 2-argument OR function $x_1 \lor x_2$ is canalyzing because if either 
argument has the value 1, then the value of $x_1 \lor x_2$ is 1. Fourteen out of 
the sixteen 2-argument Boolean functions, including the constant functions, are 
canalyzing, but this proportion drops rapidly among Boolean functions with 
more than two arguments. Thus the hypothesis that nets with many canalyzing 
gates tend to be ordered, while those with few of them do not, is consistent with 
the experimental results. 

All of these definitions and claims have precise mathematical formulations, 
so a natural question is whether the experimental results are supported by 
proofs. Interestingly, at about the same time that Kauffman started investi- 
gating random Boolean networks, the mathematical techniques for dealing with 
random networks were being developed by P. Erdős and A. Rényi and by 
E. Gilbert, but it was quite some time before any of these techniques were 
applied to the analysis of random Boolean networks. The first proofs of any of 
Kauffman's claims appear in an article co-authored by a mathematical biologist 
(J. Cohen) and a random graph theorist (T. Łuczak). 

Random graph theory is now a flourishing branch of combinatorics. The 
most extensively studied version of random graph is the independent edge model. 
In this version, there is a probability p (which may depend on the number of 
vertices in the graph) such that for each pair of vertices independently, there 
is an undirected edge between them with probability p. Graph theorists have 
discovered many deep and interesting results about this kind of random graph, 
but it does not seem to be a good model of the random networks studied in bi- 
ology, communications, and engineering. A major distinction is that the degree 
distribution of this kind of graph is Poisson, but the degree distribution of many 
real-world networks obeys a power law. A better model for these situations may 
be random graphs with a specified degree distribution, which are considered in 
recent articles by M. Molloy and B. Reed. Some other shortcomings 
of the standard version of random graph, pointed out by M. Newman, S. Strogatz, 
and D. Watts, are that it is undirected and has only one type of vertex. They 
develop some techniques for dealing with random directed graphs with vertices 
of several types. However, even this model lacks the structure needed to model 
the dynamic behavior of networks. 

Kauffman's Boolean networks are a further extension of these models 
that do include this additional structure. The gates of a Boolean network are 
vertices assigned a type corresponding to a Boolean function, and the directed 
edges indicate the inputs to each gate. But instead of simply regarding each 
vertex as a static entity, we are interested in how the functions of the gates 
change the state of the network over time. Our random Boolean networks 
are specified by a sequence of probabilities $p_1, p_2, \ldots$ whose sum is 1, where 
for each gate independently, $p_i$ is the probability that it is assigned the $i$th 
Boolean function. Once each gate has been assigned its function, its indegree 
is determined by the number of arguments of the function, and its input gates 
are chosen at random using the uniform distribution. Lastly, a random initial 
state is chosen. 

Our main results are simple algebraic conditions, derived from the 
distribution $p_1, p_2, \ldots$, that imply ordered behavior of the first two kinds 
mentioned above: almost all gates stabilize quickly, and almost all gates can be 
perturbed without affecting the long-term behavior of the network. Conversely, 
if the conditions fail, then the networks do not behave in such an ordered 
fashion. Our conditions actually imply forms of ordered behavior stronger than 
Kauffman's. 
That is, the gates stabilize in time on the order of log n, where n is the num- 
ber of gates, and the effect of a perturbation dies out within order log n steps. 
Consequently, the failure of our conditions implies forms of disordered behavior 
that are weaker than the negations of Kauffman's. 

We then apply our main results to the two classes of 2-input Boolean net- 
works mentioned above. Here, our analysis verifies some of Kauffman's claims 
for networks in the first class, but it casts doubt on similar claims for the other 
class. 

2 Definitions 

A Boolean network $B$ is a 3-tuple $(V, E, f)$ where $V$ is a set $\{1, \ldots, n\}$ for 
some natural number $n$, $E$ is a set of directed edges on $V$, and $f = (f_1, \ldots, f_n)$ 
is a sequence of Boolean functions such that for each $v \in V$, the number of 
arguments of $f_v$ is $\mathrm{indeg}(v)$, the indegree of $v$ in $E$. The interpretation is that 
$V$ is a collection of Boolean gates, $E$ describes their interconnections, and $f$ 
describes their operation. 

The gates update their states synchronously at discrete time steps $0, 1, \ldots$. 
At any time $t$, each gate $v$ is in some state $x_v \in \{0, 1\}$. Letting $x = (x_1, \ldots, x_n)$, 
we say that $B$ is in state $x$ at time $t$. Let $\mathrm{indeg}(v) = m$ and $u_1 < u_2 < \cdots < u_m$ 
be the gates such that $(u_i, v) \in E$ for $i = 1, \ldots, m$. These are referred to as the 
in-gates of $v$. Then the state of $v$ at time $t + 1$ is $y_v = f_v(x_{u_1}, \ldots, x_{u_m})$. Letting 
$y = (y_1, \ldots, y_n)$, we put $B(x) = y$. 
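The synchronous update rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: gates are numbered from 0 rather than 1, each gate is stored as a pair (in-gates, function), and the three-gate network `example` is hypothetical and does not come from the text.

```python
from typing import Callable, Sequence, Tuple

# Each gate is a pair (in_gates, f): in_gates lists the gate's inputs in
# increasing order (0-based here, an assumption), f maps their states to 0 or 1.
Gate = Tuple[Tuple[int, ...], Callable[..., int]]


def step(gates: Sequence[Gate], x: Sequence[int]) -> Tuple[int, ...]:
    """Return B(x): all gates read the current state x and update together."""
    return tuple(f(*(x[u] for u in in_gates)) for in_gates, f in gates)


# Hypothetical 3-gate network, not taken from the text:
# gate 0 = OR(gate 1, gate 2), gate 1 = NOT(gate 0), gate 2 = constant 1.
example = [
    ((1, 2), lambda a, b: a | b),
    ((0,), lambda a: 1 - a),
    ((), lambda: 1),
]

print(step(example, (0, 0, 0)))  # (0, 1, 1)
```

Because every gate reads the old state before any gate is overwritten, the new state is built as a fresh tuple rather than updated in place.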

The next definitions describe the dynamical properties of Boolean networks 
that we will analyze. 

Definition 1. Let $x \in \{0,1\}^n$. 

1. For $t = 0, 1, \ldots$, we put $B^t(x)$ for the state of $B$ at time $t$, given that its 
state at time 0 is $x$. That is, 

\[ B^0(x) = x, \text{ and} \]
\[ B^{t+1}(x) = B(B^t(x)) \text{ for all } t. \]

We also put $B^t_v(x)$ for $y_v$, where $y = B^t(x)$. 

2. Gate $v$ stabilizes in $t$ steps on input $x$ if $B^{t'}_v(x) = B^t_v(x)$ for all $t' > t$. 

3. For $x \in \{0,1\}^n$ and $v \in \{1, \ldots, n\}$, we put $x^v$ for the state which is 
identical to $x$ except that $x^v_v = 1 - x_v$. 

4. Let $u, v \in \{1, \ldots, n\}$ and $x \in \{0,1\}^n$. We say that $v$ affects $u$ at 
time $t$ on input $x$ if $B^t_u(x) \neq B^t_u(x^v)$. We put 
$A^t(v, x) = \{ u \in V : v \text{ affects } u \text{ at time } t \text{ on input } x \}$. 

5. Gate $v$ is $t$-weak on input $x$ if $A^t(v, x) = \emptyset$, i.e., $B^t(x) = B^t(x^v)$. Gate $v$ 
is $t$-strong on $x$ if it is not $t$-weak on $x$. If $x$ is understood, we simply say 
$v$ is $t$-weak or $t$-strong. 
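The notions of "affects" and "$t$-weak" can be checked by brute force: run the network from a state and from the same state with one gate flipped, then compare. The following is a minimal sketch, assuming 0-based gate numbers and the same hypothetical three-gate network as an illustration; none of the helper names come from the text.

```python
def step(gates, x):
    """One synchronous update B(x)."""
    return tuple(f(*(x[u] for u in in_gates)) for in_gates, f in gates)


def run(gates, x, t):
    """B^t(x): the state after t synchronous updates starting from x."""
    for _ in range(t):
        x = step(gates, x)
    return tuple(x)


def flip(x, v):
    """The state x^v: identical to x except that gate v is complemented."""
    return tuple(1 - s if u == v else s for u, s in enumerate(x))


def affected(gates, x, v, t):
    """A^t(v, x): the gates whose state at time t differs after flipping v."""
    a, b = run(gates, x, t), run(gates, flip(x, v), t)
    return {u for u in range(len(x)) if a[u] != b[u]}


def is_weak(gates, x, v, t):
    """Gate v is t-weak on input x when A^t(v, x) is empty."""
    return not affected(gates, x, v, t)


# Hypothetical network: gate 0 = OR(1, 2), gate 1 = NOT(0), gate 2 = constant 1.
example = [
    ((1, 2), lambda a, b: a | b),
    ((0,), lambda a: 1 - a),
    ((), lambda: 1),
]

for t in range(4):
    print(t, affected(example, (0, 0, 0), 1, t))
```

In this toy network the perturbation of gate 1 moves to gate 0, back to gate 1, and then disappears, so gate 1 is 3-weak on the all-zero input.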
For small intervals of time, the dynamical properties described above are 
determined by the "local" structure of the network. That is, the behavior of 
a gate over the interval 0,1, ... ,t is determined by the portion of the network 
consisting of all gates that can reach the gate by a path in E of length at most 
t. Similarly, the gates affected by a given gate lie in the portion consisting of 
all gates reachable from the gate by such a path. Of course, for large enough 
t, these portions will be the entire network. The next definitions capture these 
notions of locality. 

Definition 2. 1. For any subset $I \subseteq V$, 

\[ S^0_+(I) = I, \text{ and} \]
\[ S^{t+1}_+(I) = \{ u : (v, u) \in E \text{ for some } v \in S^t_+(I) \} \text{ for } t \geq 0. \]

That is, $S^t_+(I)$ is the set of gates at the ends of paths of length $t$ that start 
in $I$. Similarly, $S^t_-(I)$ is the set of gates at the beginnings of paths of 
length $t$ that end in $I$. 

2. Then 

\[ N^t_+(I) = \bigcup_{s=0}^{t} S^s_+(I), \text{ and} \]
\[ N^t_-(I) = \bigcup_{s=0}^{t} S^s_-(I) \]

are the out- and in-neighborhoods respectively of $I$ of radius $t$. 

We put $S^t_+(v)$ for $S^t_+(\{v\})$ and similarly for the other notations. Thus the 
state of gate $v$ at time $t$ is determined by the states of the gates in $S^t_-(v)$ and 
the functions assigned to the gates in $N^{t-1}_-(v)$. 

As we will show, for sufficiently small $I$ and $t$, the "typical" $N^t_+(I)$ and 
$N^t_-(I)$ induce a forest on $(V, E)$, i.e., there are no directed or undirected cycles 
among their gates. If this is the case for $N^t_+(v)$, then we can give a simple 
recursive definition of $A^t(v, x)$. 

Definition 3. Let $f(x_1, \ldots, x_m)$ be a Boolean function of $m$ arguments, and 
$x = (x_1, \ldots, x_m) \in \{0,1\}^m$ be an assignment of 0's and 1's to its arguments. 
For $i \in \{1, \ldots, m\}$, we say that argument $i$ directly affects $f$ on input $x$ if 
$f(x) \neq f(x^i)$. We extend this notion to gates in a Boolean network in the 
obvious way. Given a Boolean network $B$ where gate $v$ has in-gates $u_1 < \cdots < u_m$ 
and state $x \in \{0,1\}^n$, for $i = 1, \ldots, m$, $u_i$ directly affects $v$ on input $x$ if 
$B_v(x) \neq B_v(x^{u_i})$. 

Lemma 1. Assume $N^t_+(v)$ induces a tree on $E$. Then for any $s \leq t$, any 
$x \in \{0,1\}^n$, and any gate $u \in S^s_+(v)$, $v$ affects $u$ at time $s$ on input $x$ if and 
only if 

1. $s = 0$ and $u = v$, or 

2. $s > 0$ and, letting $w$ be the unique gate such that $w \in S^{s-1}_+(v) \cap S^1_-(u)$, 
$v$ affects $w$ at time $s - 1$ on input $x$, and $w$ directly affects $u$ on input 
$B^{s-1}(x)$. 

3 Random Boolean Networks 

We will be examining randomly constructed Boolean networks. The random 
model we use appears to be sufficiently general to capture the particular classes 
of random Boolean networks in the literature. Let $\phi_1, \phi_2, \ldots$ be some ordering of 
all the finite Boolean functions, and let $p_1, p_2, \ldots$ be a sequence of probabilities 
such that $\sum_{i=1}^{\infty} p_i = 1$. The selection of a random Boolean network with $n$ gates 
is a three stage process. First, each gate is independently assigned a Boolean 
function using the distribution $p_1, p_2, \ldots$. That is, for each $v = 1, \ldots, n$ and 
$j = 1, 2, \ldots$, the probability that gate $v$ is assigned $\phi_j$ is $p_j$. Next, the 
in-gates for each gate are selected. If the gate has been assigned an $m$-argument 
function, then its in-gates are chosen from the $\binom{n}{m}$ equally likely possibilities. 
Finally, a random initial state is chosen from the $2^n$ equally likely possibilities. 
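The three-stage selection process can be sketched as follows. This is an illustration under assumptions that are not in the text: the distribution has finite support and is passed as parallel lists `functions` and `probs`, each function is a pair (arity, truth table), and gates are numbered from 0. The name `sample_network` and the example list `two_input` are hypothetical.

```python
import random


def sample_network(n, functions, probs, seed=None):
    """Draw a random Boolean network with n gates and a random initial state.

    functions: list of (m, table) pairs, table being a tuple of 2**m output bits.
    probs:     the probabilities p_i of the corresponding functions.
    """
    rng = random.Random(seed)
    gates = []
    for _ in range(n):
        # Stage 1: assign a function to the gate according to the distribution.
        m, table = rng.choices(functions, weights=probs, k=1)[0]
        # Stage 2: choose one of the C(n, m) sets of in-gates uniformly.
        in_gates = tuple(sorted(rng.sample(range(n), m)))
        gates.append((in_gates, table))
    # Stage 3: choose one of the 2**n initial states uniformly.
    x0 = tuple(rng.randrange(2) for _ in range(n))
    return gates, x0


# All sixteen 2-argument functions, each with probability 1/16.
two_input = [(2, tuple((k >> i) & 1 for i in range(4))) for k in range(16)]
gates, x0 = sample_network(50, two_input, [1 / 16] * 16, seed=0)
print(len(gates), len(x0))
```

Sampling an unordered set of in-gates and then sorting it matches the convention that the in-gates of a gate are listed in increasing order.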

We make several restrictions on the distribution $p_1, p_2, \ldots$ still consistent 
with the random networks in the literature. Since we are assuming that all 
orderings of the in-gates to a gate are equally likely, for any $j$ and $k$ such that 
$\phi_j$ and $\phi_k$ are identical except for the ordering of their arguments, $p_j = p_k$. 
Also, for any $j$ and $k$ such that $\phi_j = \neg\phi_k$, $p_j = p_k$. This implies that, for 
any gate $v$ such that $N^t_-(v)$ is acyclic, $B^t_v(x)$ is equally likely to be 0 or 1. 
Lastly, we assume that the average and variance of the number of arguments of 
a randomly selected Boolean function, or equivalently, the average and variance 
of the indegree of a gate, are finite. That is, letting each $\phi_i$ have $m_i$ arguments, 
$\sum_{i=1}^{\infty} p_i m_i^2 \in [0, \infty)$. 

4 Branching Processes 

As will be shown, for $t$ not large compared to $n$, the typical $N^t_+(v)$ induces a tree 
in a Boolean network with $n$ gates. A perturbation of the state of such $v$ may 
cause perturbations to the states of $S^1_+(v)$ in the next step, then $S^2_+(v)$, and so 
on, in a "wave" that propagates through $N^t_+(v)$. It is possible that this wave 
dies out and the effects of the perturbation are transient, i.e., gate $v$ is weak. 
We will show that this behavior can be approximated by a branching process. 
Then, by applying basic results about branching processes, we will derive our 
results about weak gates. We will summarize the results that we need. For 
more information on branching processes, see T. Harris. 

A branching process can be identified with a rooted labelled tree. The tree 
may have infinite branches. Each node will be labelled with the unique path from 
the root to that node. That is, the root is labelled with the null sequence. If the 
root has $k$ children, they are labelled with the sequences $(1), (2), \ldots, (k)$. If the 
second child of the root has $l$ children, then they are labelled with the sequences 
$(2, 1), (2, 2), \ldots, (2, l)$, and so on. Generation $t$ consists of all nodes labelled 
with a sequence of length $t$. The number of children of any node is independent 
of the number of children of any other node, but the probability of having a 
certain number of children is the same for all nodes. Thus the probability space 
of a branching process is determined by a sequence $(q_k : k = 0, 1, \ldots)$ where $q_k$ 
is the probability that a node has $k$ children. The probability measure on this 
space will be denoted by bpr. In describing events in this space, $P$ will denote 
a branching process. If $\chi$ is a property of branching processes, $P \models \chi$ means $\chi$ 
holds for $P$, and $\mathrm{bpr}(P \models \chi)$ is the probability that $\chi$ holds. 

For $t \geq 0$, $P \restriction t$ will be the finite labelled tree which is $P$ restricted to its 
first $t$ generations. $Z_t$ will be the random variable which is the size of generation 
$t$, i.e., the number of nodes of depth $t$. 

The generating function of the branching process is the series 

\[ F(z) = \sum_{k=0}^{\infty} q_k z^k. \]

That is, $F(z)$ is the probability generating function of $Z_1$ since $q_k = \mathrm{bpr}(Z_1 = k)$. 
A basic result is that the $t$-th iterate of $F(z)$ is the probability generating 
function of $Z_t$. The iterates of $F(z)$ are defined by 

\[ F_0(z) = z \text{ and} \]
\[ F_{t+1}(z) = F(F_t(z)) \text{ for } t \geq 0. \qquad (1) \]

Then 

Theorem 1. The probability generating function of $Z_t$ is $F_t(z)$, i.e., 

\[ F_t(z) = \sum_{k=0}^{\infty} \mathrm{bpr}(Z_t = k)\, z^k. \]

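This enables us to express the moments of $Z_t$ in terms of the moments of $Z_1$, 
which in turn have simple representations in terms of the derivatives of $F(z)$. 
Let $\mu$ and $\sigma^2$ be the mean and variance of $Z_1$, that is, $\mu = E(Z_1)$ and 
$\sigma^2 = \mathrm{var}(Z_1)$. 

Theorem 2. We have 

\[ \mu = F'(1) \text{ and} \]
\[ \sigma^2 = F''(1) + F'(1) - (F'(1))^2. \]

More generally, for all $t > 0$, the mean and variance of $Z_t$ are 

\[ E(Z_t) = \mu^t \text{ and} \]
\[ \mathrm{var}(Z_t) = \begin{cases} \dfrac{\sigma^2 \mu^{t-1}(\mu^t - 1)}{\mu - 1} & \text{if } \mu \neq 1, \\ t\sigma^2 & \text{if } \mu = 1. \end{cases} \]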
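The moment formulas of Theorem 2 can be checked numerically for an offspring distribution with finite support, by composing the generating function with itself as in Equation (1) and reading the moments off the coefficients. The sketch below is an illustration only; the helper names and the two test distributions are assumptions made for this example.

```python
def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def compose(F, G):
    """Coefficients of F(G(z)), computed by Horner's rule."""
    out = [0.0]
    for c in reversed(F):
        out = poly_mul(out, G)
        out[0] += c
    return out


def iterate_pgf(q, t):
    """Coefficients of F_t(z), where F(z) = sum_k q[k] z^k and F_0(z) = z."""
    Ft = [0.0, 1.0]
    for _ in range(t):
        Ft = compose(q, Ft)
    return Ft


def moments(coeffs):
    """Mean and variance of the distribution with the given probabilities."""
    mean = sum(k * c for k, c in enumerate(coeffs))
    second = sum(k * k * c for k, c in enumerate(coeffs))
    return mean, second - mean * mean


def theorem2(q, t):
    """Mean and variance of Z_t as given by the closed forms of Theorem 2."""
    mu, sigma2 = moments(q)
    if abs(mu - 1.0) < 1e-12:
        return 1.0, t * sigma2
    return mu ** t, sigma2 * mu ** (t - 1) * (mu ** t - 1) / (mu - 1)


for q in ([0.2, 0.3, 0.5], [0.25, 0.5, 0.25]):
    print(moments(iterate_pgf(q, 3)), theorem2(q, 3))
```

The second distribution has mean exactly 1, so it exercises the critical case in which the variance grows linearly in $t$.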
5 Weak Gates 

In this section, $\alpha$ and $\beta$ will be positive constants satisfying $2\alpha \log \delta + 2\beta < 1$ 
and $\alpha \log \delta < \beta$, where $\delta = E(m_i)$. 

Lemma 2. Let $S \subseteq \{1, \ldots, n\}$, $|S| \leq n^{\beta}$, and $t \leq \alpha \log n$. The following events 
have probability $1 - o(1)$: 

1. For every $v \in S$, $N^t_-(v)$ induces a tree in $(V, E)$. 

2. For every distinct $u, v \in S$, $N^t_-(u) \cap N^t_-(v) = \emptyset$. 

Proof. We show that each of these events fails with probability $o(1)$. The 
calculations are similar for both events, and we show the work only for event 1. 
If 1. fails, then there exist distinct gates $v_1, \ldots, v_s$ such that 

    $s \leq \alpha \log n$, 
    for $i = 1, \ldots, s - 1$, $v_i$ is an in-gate of $v_{i+1}$, and 
    $v_s \in S$, 

and distinct gates $w_1, \ldots, w_r$ such that 

    $r \leq \alpha \log n$, 
    for $i = 1, \ldots, r - 1$, $w_i$ is an in-gate of $w_{i+1}$, 
    $w_1 = v_1$, and 
    for some $h \in \{1, \ldots, s\}$, $w_r = v_h$. 

Either $h$ above is 1 or greater than 1. The two cases are similar, and we will 
describe only the second. Therefore we can assume $r > 2$. Now $s$, $r$, and $h$ 
can be chosen in $O((\log n)^3)$ ways. The gates $v_1, \ldots, v_s$ and $w_2, \ldots, w_{r-1}$ can 
be chosen in $O(n^{s+r-3+\beta})$ ways. For each $j \in \{1, \ldots, s - 1\} - \{h - 1\}$, the 
probability that $v_j$ is an in-gate of $v_{j+1}$ is 

\[ \sum_{i=1}^{\infty} p_i \frac{m_i}{n} = \frac{\delta}{n}. \]

Similarly, the probability that each $w_j$ is an in-gate of $w_{j+1}$ for $j = 1, \ldots, r - 2$ 
is $\delta/n$. The probability that both $v_{h-1}$ and $w_{r-1}$ are in-gates of $v_h$ is 

\[ \sum_{i=1}^{\infty} p_i \frac{m_i(m_i - 1)}{n(n - 1)} = O(n^{-2}). \]

Altogether, the probability that 1. fails is 

\[ O\bigl((\log n)^3 \times n^{s+r-3+\beta} \times (\delta/n)^{s+r-4} \times n^{-2}\bigr) = O\bigl((\log n)^3 \delta^{2\alpha \log n} n^{\beta - 1}\bigr) \]
\[ = O\bigl((\log n)^3 n^{2\alpha \log \delta + \beta - 1}\bigr) \]
\[ = o(1). \]

□ 



We will use the branching process defined as follows. For each $i = 1, 2, \ldots$ let 
$\phi_i$ have $m_i$ arguments, and define 

\[ \lambda = \sum_{i=1}^{\infty} p_i \sum_{j=1}^{m_i} \frac{|\{ x \in \{0,1\}^{m_i} : \text{argument } j \text{ directly affects } \phi_i \text{ on input } x \}|}{2^{m_i}}. \]

Thus $\lambda$ may be regarded as the average number of arguments that directly 
affect a random Boolean function with a random input. Since we are assuming 
all orderings of the arguments of a Boolean function are equally likely, we can 
simplify the definition of $\lambda$ to 

\[ \lambda = \sum_{i=1}^{\infty} p_i m_i \frac{|\{ x \in \{0,1\}^{m_i} : \text{argument 1 directly affects } \phi_i \text{ on input } x \}|}{2^{m_i}}. \qquad (2) \]

The branching process is defined by 

\[ q_k = \frac{\lambda^k e^{-\lambda}}{k!} \]

for $k = 0, 1, \ldots$. Therefore $F(z) = e^{\lambda z - \lambda}$. From Theorem 2, $\mu = \lambda$ and $\sigma^2 = \lambda$. 


Definition 4. Let $T$ be a labelled tree of height $t$, $B = (V, E, f)$ be a Boolean 
network, and $x \in \{0,1\}^n$ be its state. For $v \in \{1, \ldots, n\}$, we put $T \Rightarrow v$ if 

    $N^t_-(A^t(v, x))$ induces a tree in $(V, E)$, and 

    there is an isomorphism from $T$ onto $(A^t(v, x), E)$. 

Lemma 3. If $|T| \leq n^{\beta}$ and the height of $T$ is $t \leq \alpha \log n$, then for all 
$x \in \{0,1\}^n$, $\mathrm{pr}(T \Rightarrow v) = \mathrm{bpr}(P \restriction t = T)(1 + o(1))$. 
Proof. By Lemma 2, if there is an isomorphism $\tau$ from $T$ onto $(A^t(v, x), E)$, 
then almost surely $N^t_-(A^t(v, x))$ induces a tree in $(V, E)$. Thus we need only 
analyze the probability that $\tau$ exists. Let $u_1, \ldots, u_s$ be the non-leaf nodes of $T$, 
in lexicographic order. The construction of $\tau$ is recursive and proceeds in stages 
$1, \ldots, s$. At each stage $j$, $\tau(u_j)$ has been defined at some previous stage, and 
it is extended to the children of $u_j$. (At stage 1, $\tau(u_1) = v$ has already been 
defined.) Also, the Boolean functions assigned to these children are selected. 

Thus, assume that at stage $j$, $\tau(u_1), \ldots, \tau(u_{K_j})$ have already been defined, 
where $j \leq K_j$. Let $u_j$ have $k_j$ children. Then there are $\binom{n - K_j}{k_j}$ ways of selecting 
the children of $\tau(u_j)$ in $A^t(v, x)$. Having chosen these children, we next assign 
Boolean functions to them. Independently, for each child $w$ of $\tau(u_j)$, let $\phi_i$ be 
assigned to it. This event has probability $p_i$, and the probability that $\tau(u_j)$ is 
an in-gate of $w$ is 

\[ \binom{n-1}{m_i - 1} \Big/ \binom{n}{m_i} = \frac{m_i}{n}. \]

Summing over all $i$, we get the probability that $\tau(u_j)$ directly affects $w$: 

\[ \sum_{i=1}^{\infty} \frac{p_i m_i \times |\{ x \in \{0,1\}^{m_i} : \text{argument 1 directly affects } \phi_i \text{ on input } x \}|}{n 2^{m_i}} = \frac{\lambda}{n}. \]

Therefore the probability that these $k_j$ gates are directly affected by $\tau(u_j)$ is 
$(\lambda/n)^{k_j}$. 

Since the events of assigning Boolean functions to all the gates are 
independent, the probability that the selected gates belong to $A^t(v, x)$ is 



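\[ \prod_{j=1}^{s} \binom{n - K_j}{k_j} \Bigl(\frac{\lambda}{n}\Bigr)^{k_j} = \prod_{j=1}^{s} \frac{\lambda^{k_j}}{k_j!} \bigl(1 + O(n^{2\beta - 1})\bigr). \]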



The probability that no other gates are in $A^t(v, x)$ is 

\T\ 



Therefore 




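\[ \mathrm{pr}(T \Rightarrow v) = \Bigl( \prod_{j=1}^{s} e^{-\lambda} \frac{\lambda^{k_j}}{k_j!} \Bigr) (1 + o(1)) \]
\[ = \mathrm{bpr}(P \restriction t = T)(1 + o(1)). \]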



□ 
We say that a property $\chi$ of branching processes depends only on the first $t$ 
generations if, for any two branching processes $P_1$ and $P_2$ such that 
$P_1 \restriction t = P_2 \restriction t$, either $P_1 \models \chi$ and $P_2 \models \chi$, or $P_1 \not\models \chi$ and $P_2 \not\models \chi$. Thus $\chi$ can be 
identified with a set of labelled trees of depth at most $t$. We will also use the 
notation $(A^t(v, x), E) \models \chi$ to mean $(A^t(v, x), E)$ induces a tree in $(V, E)$ whose 
corresponding branching process satisfies $\chi$. 

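Theorem 3. Let $\chi$ be a property of branching processes that depends only on 
the first $\alpha \log n$ generations. Then for all $x \in \{0,1\}^n$, 

\[ \mathrm{pr}\bigl((A^{\alpha \log n}(v, x), E) \models \chi\bigr) = \mathrm{bpr}(P \models \chi) + o(1). \]

Proof. By the previous lemma, it suffices to show that 
$\mathrm{bpr}(|P \restriction \alpha \log n| > n^{\beta}) = o(1)$. 

If $|P \restriction \alpha \log n| > n^{\beta}$, then $Z_t > n^{\beta}/(\alpha \log n)$ for some $t = 1, \ldots, \alpha \log n$. 
Since $E(Z_t) = \lambda^t \leq \delta^t \leq n^{\alpha \log \delta} < n^{\beta}/(\alpha \log n)$, 

\[ \mathrm{bpr}\bigl(Z_t > n^{\beta}/(\alpha \log n)\bigr) \leq \frac{\mathrm{var}(Z_t)}{\bigl(n^{\beta}/(\alpha \log n) - E(Z_t)\bigr)^2} \quad \text{by Chebyshev's inequality} \]
\[ = \begin{cases} \dfrac{\lambda^{2t-1} + \lambda^{2t-2} + \cdots + \lambda^t}{\bigl(n^{\beta}/(\alpha \log n) - \lambda^t\bigr)^2} & \text{if } \lambda \neq 1, \\[2ex] \dfrac{t}{\bigl(n^{\beta}/(\alpha \log n) - \lambda^t\bigr)^2} & \text{if } \lambda = 1 \end{cases} \]
\[ = o(1/\log n) \text{ in either case.} \]

□ 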

A gate $v$ such that $N^{\alpha \log n}_-(A^{\alpha \log n}(v, x))$ is acyclic is $\alpha \log n$-weak if and only 
if its corresponding branching process is extinct within $\alpha \log n$ generations. 
Clearly this depends only on the first $\alpha \log n$ generations, so Theorem 3 applies. 
By basic results from branching process theory, the probability of extinction in 
$t$ generations is $\mathrm{bpr}(Z_t = 0) = F_t(0)$, and $\lim_{t \to \infty} F_t(0) = r$, where $r$ is the 
smallest nonnegative root of $z = F(z)$. Further, when $\mu \leq 1$, $r = 1$, and when 
$\mu > 1$, $r < 1$. Therefore 

Theorem 4. There is a constant $r$ such that for all $x \in \{0,1\}^n$, 

\[ \lim_{n \to \infty} \mathrm{pr}(v \text{ is } \alpha \log n\text{-weak}) = r. \]

When $\lambda \leq 1$, $r = 1$, and when $\lambda > 1$, $r < 1$. 
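For the Poisson branching process with mean $\lambda$, whose generating function is $e^{\lambda(z-1)}$, the extinction probabilities $F_t(0)$ and their limit $r$ are easy to compute by fixed-point iteration started at 0. The following is a minimal numerical sketch; the function names, the tolerance, and the iteration cap are assumptions made for this example.

```python
import math


def extinct_within(lam, t):
    """F_t(0): probability that the Poisson(lam) process dies out within t generations."""
    z = 0.0
    for _ in range(t):
        z = math.exp(lam * (z - 1.0))
    return z


def extinction_probability(lam, tol=1e-13, max_iter=1_000_000):
    """Limit of F_t(0), i.e. the smallest nonnegative root of z = exp(lam * (z - 1))."""
    z = 0.0
    for _ in range(max_iter):
        nz = math.exp(lam * (z - 1.0))
        if abs(nz - z) < tol:
            return nz
        z = nz
    return z  # not converged within max_iter; convergence is slow when lam is near 1


for lam in (0.5, 2.0):
    print(lam, extinct_within(lam, 5), extinction_probability(lam))
```

Starting from 0 matters: the iterates increase monotonically to the smallest nonnegative root, whereas $z = 1$ is always a root and would be returned immediately by an iteration started there.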



Corollary 1. The expected number of $\alpha \log n$-weak gates in a random Boolean 
network is asymptotic to $rn$. 

A stronger result is 

Corollary 2. The number of $\alpha \log n$-weak gates in almost all Boolean networks 
is asymptotic to $rn$. 
That is, there is a function $\varepsilon(n)$ such that $\varepsilon(n) \to 0$ and, letting the random 
variable $X_n$ be the number of $\alpha \log n$-weak gates in a random Boolean network 
with $n$ gates, 

\[ \lim_{n \to \infty} \mathrm{pr}(|X_n - rn| \leq n\varepsilon(n)) = 1. \]

Proof. By the previous corollary, 

\[ E(X_n) = rn + n\varepsilon(n), \]

where $\varepsilon(n)$ is a function such that $\lim_{n \to \infty} \varepsilon(n) = 0$. When $\lambda \leq 1$, $r = 1$, so, 
letting the random variable $Y_n = n - X_n$, by Markov's inequality 

\[ \mathrm{pr}\bigl(Y_n \geq n\sqrt{|\varepsilon(n)|}\bigr) = O\bigl(\sqrt{|\varepsilon(n)|}\bigr). \]

Therefore the corollary holds for $\lambda \leq 1$. 

When $\lambda > 1$, $r < 1$, and we need to estimate $\mathrm{var}(X_n)$. Using methods similar 
to those in the proofs of Lemma 2 and Theorems 3 and 4, it can be shown that, 
for any two distinct gates $u$ and $v$, almost surely $N^{\alpha \log n}_-(A^{\alpha \log n}(u, x))$ and 
$N^{\alpha \log n}_-(A^{\alpha \log n}(v, x))$ are acyclic, their intersection is empty, and 

\[ \lim_{n \to \infty} \mathrm{pr}(u \text{ and } v \text{ are } \alpha \log n\text{-weak}) = r^2. \]

Therefore 

\[ \mathrm{var}(X_n) = r(1 - r)n + n^2 \varepsilon'(n) \]

for some function $\varepsilon'(n) \to 0$. By Chebyshev's inequality 

\[ \mathrm{pr}\bigl(|X_n - rn - n\varepsilon(n)| \geq n\,\varepsilon'(n)^{1/4}\bigr) \leq \frac{r(1 - r)n + n^2 \varepsilon'(n)}{n^2 \sqrt{\varepsilon'(n)}} \]

and the corollary also holds for $\lambda > 1$. □ 

When $\lambda > 1$, it is also true that most of the $\alpha \log n$-strong gates affect many 
other gates when perturbed. 

Corollary 3. Let $\lambda > 1$. For almost all random Boolean networks, if gate $v$ is 
$\alpha \log n$-strong, then there is a positive $W$ such that for $t \leq \alpha \log n$, the number 
of gates affected by $v$ at time $t$ is asymptotic to $W\lambda^t$. 

Proof. For $t \geq 0$, let $W_t = Z_t/\mu^t$ ($= Z_t/\lambda^t$ in our case). Again by basic results 
from branching process theory, there is a random variable $W$ such that 

\[ \mathrm{bpr}\bigl(\lim_{t \to \infty} W_t = W\bigr) = 1 \text{ and} \]
\[ \lim_{t \to \infty} \mathrm{bpr}(Z_t \neq 0 \text{ and } W = 0) = 0. \qquad (3) \]

From this the corollary follows. □ 
6 Forced Gates 

Instead of analyzing the stable gates in a Boolean network, we will study the 
forced gates. Since a gate stabilizes if it is forced, this is a stronger condition, 
but it seems to be more amenable to combinatorial analysis. 

For the remainder of this section, $t$ will represent a natural number in the 
range $0, \ldots, \alpha \log n$, and $y$ will be a variable taking on the values 0 and 1. 
Given a Boolean function $\phi(x_1, \ldots, x_m)$ and $x = (x_1, \ldots, x_m) \in \{0, 1, *\}^m$, we 
say that $x$ forces $\phi$ to $y$ if, for all $x' \in \{0,1\}^m$ such that $x_i = x'_i$ whenever 
$x_i \neq *$, $\phi(x') = y$. The $*$'s are "don't care" values, meaning their value does 
not affect the value of $\phi$ whenever the remaining arguments agree with $x$. For 
example, $\phi$ is forced by every $x \in \{0,1\}^m$; if $\phi$ is a constant function, then it 
is forced by every $x \in \{0, 1, *\}^m$; if $\phi(x_1, x_2) = x_1 \lor x_2$, then it is forced to 0 
by $(0, 0)$ and to 1 by $(0, 1)$, $(1, 0)$, $(1, 1)$, $(1, *)$, and $(*, 1)$. We can now give a 
recursive definition of forcing for the gates of a Boolean network. 

Definition 5. A gate $v$ is forced to $y$ in 0 steps if $f_v$ is the constant function 
$y$. 

For $t \geq 0$, $v$ is forced to $y$ in $t + 1$ steps if, letting $u_1, \ldots, u_m$ be its in-gates, 
there is $x \in \{0, 1, *\}^m$ such that $x$ forces $f_v$ to $y$ and for each $i = 1, \ldots, m$ 
such that $x_i \neq *$, $u_i$ is forced to $x_i$ in $t$ steps. We say that $v$ is forced (in some 
number of steps) if it is forced to 0 or 1. 
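The notion of a partial assignment forcing a function, on which Definition 5 rests, can be checked by enumerating all completions of the "don't care" positions. The sketch below is illustrative only: representing $*$ by Python's `None` and the helper names are assumptions of this example.

```python
from itertools import product

STAR = None  # stands for the "don't care" value *


def forces(f, partial, y):
    """True if every 0/1 completion of the partial assignment makes f equal y."""
    free = [i for i, v in enumerate(partial) if v is STAR]
    for fill in product((0, 1), repeat=len(free)):
        full = list(partial)
        for i, v in zip(free, fill):
            full[i] = v
        if f(*full) != y:
            return False
    return True


def forcing_assignments(f, m, y):
    """All x in {0, 1, *}^m that force the m-argument function f to y."""
    return [p for p in product((0, 1, STAR), repeat=m) if forces(f, p, y)]


def or2(a, b):
    return a | b


print(forcing_assignments(or2, 2, 0))
print(forcing_assignments(or2, 2, 1))
```

For the OR function this reproduces the list in the text: only $(0, 0)$ forces it to 0, while $(0, 1)$, $(1, 0)$, $(1, 1)$, $(1, *)$, and $(*, 1)$ force it to 1.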

It is clear that forcing is a stronger condition than stability. 

Lemma 4. If a gate in a Boolean network is forced to $y$ in $t$ steps, then it 
stabilizes to $y$ in $t$ steps. 

Further, conditioning on the event that $N^t_-(v)$ induces a tree, the 
probabilities that the in-gates of $v$ are forced in $t - 1$ steps are independent, and there 
is a recursive formula for computing the probability that $v$ is forced in $t$ steps. 
Since $N^t_-(v)$ is almost surely a tree for the values of $t$ being considered here, the 
conditional probability given by the recursive formula will be asymptotic to the 
unconditional probability of being forced in $t$ steps. 

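For any natural number $m$ and $x \in \{0, 1, *\}^m$, let $|x|_0$ be the number of 
coordinates of $x$ that are 0, and similarly for $|x|_1$ and $|x|_*$. For $i = 1, 2, \ldots$ let 
$P^y_i(z_0, z_1)$ be the polynomial in $z_0$ and $z_1$ defined by 

\[ P^y_i(z_0, z_1) = \sum_{\substack{x \in \{0,1,*\}^{m_i} \\ x \text{ forces } \phi_i \text{ to } y}} z_0^{|x|_0} z_1^{|x|_1} (1 - z_0 - z_1)^{|x|_*}. \]

Let 

\[ G^y(z_0, z_1) = \sum_{i=1}^{\infty} p_i P^y_i(z_0, z_1). \qquad (4) \]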
i=l 
Recursively, define 

\[ G^y_1(z_0, z_1) = G^y(z_0, z_1), \text{ and for } t \geq 1 \]
\[ G^y_{t+1}(z_0, z_1) = G^y\bigl(G^0_t(z_0, z_1), G^1_t(z_0, z_1)\bigr). \]

Lemma 5. If $N^t_-(v)$ induces a tree, then the probability that $v$ is forced to $y$ in 
$t$ steps is $G^y_{t+1}(0, 0)$. 

From the definition of $G^y$ and the symmetry condition $p_i = p_j$ whenever 
$\phi_i = \neg\phi_j$, we have $G^0(a, b) = G^1(a, b)$ for all $a$ and $b$, and therefore 
$G^0_t(0, 0) = G^1_t(0, 0)$ for all $t \geq 1$. Therefore letting 

\[ G(z) = 2G^0(z/2, z/2) \qquad (5) \]

and defining 

\[ G_1(z) = G(z), \text{ and for } t \geq 1 \]
\[ G_{t+1}(z) = G(G_t(z)), \]

Lemma 6. If $N^t_-(v)$ induces a tree, then the probability that $v$ is forced in $t$ 
steps is $G_{t+1}(0)$. 

Theorem 5. There exists $g \in [0, 1]$ such that 

\[ \lim_{n \to \infty} \mathrm{pr}(v \text{ is forced in } \alpha \log n \text{ steps}) = g. \]

Further, 

\[ \lim_{t \to \infty} G_t(0) = g, \]

and $g$ is a root of the equation 

\[ g = G(g). \]

Proof. Since 

\[ P^0_i(a, b) + P^1_i(a, b) \leq \sum_{x \in \{0,1,*\}^{m_i}} a^{|x|_0} b^{|x|_1} (1 - a - b)^{|x|_*} = 1 \]

for nonnegative $a$ and $b$ such that $a + b \leq 1$, 

\[ G(a) = 2G^0(a/2, a/2) \leq \sum_{i=1}^{\infty} p_i = 1 \text{ for } 0 \leq a \leq 1. \]

This implies that $G(z)$ is a continuous function on $[0, 1]$ and all $G_t(0)$ are 
bounded above by 1. We will show that $G_t(0)$ is a strictly increasing sequence 
in $t$. Then, taking $g = \sup(G_t(0) : t \geq 1)$, the theorem follows. 
To show $G_t(0) < G_{t+1}(0)$, again assuming that $N^{t+1}_-(v)$ is a tree, note 
that the event that $v$ is forced to $y$ in $t$ steps is characterized by a collection 
$\mathcal{C}$ of rooted trees of height at most $t$ whose nodes are labelled with Boolean 
functions. Each of these trees is contained in the collection $\mathcal{D}$ of rooted labelled 
trees that characterizes the event that $v$ is forced to $y$ in $t + 1$ steps. Further, 
some of these trees in $\mathcal{C}$ are of height $t$, and their only leaves that are labelled 
with constant functions have depth $t$. Take any such tree and replace each leaf 
that is labelled with a constant with a subtree consisting of a node labelled with 
a nonconstant function and new in-gates all labelled with constants such that 
the state of the leaf remains unchanged. The new tree belongs to $\mathcal{D}$ but not $\mathcal{C}$ 
because $v$ will be forced in $t + 1$ steps but not $t$ steps. Therefore $\mathcal{D}$ is strictly 
larger than $\mathcal{C}$, and $G_t(0) < G_{t+1}(0)$. □ 

Corollary 4. The expected number of gates that are forced in $\alpha \log n$ steps is 
asymptotic to $gn$. 

Corollary 5. The number of gates that are forced in $\alpha \log n$ steps in almost all 
Boolean networks is asymptotic to $gn$. 

7 Networks of 2-Input Gates 

We now apply the general results of the previous two sections to some networks 
studied by Kauffman. As mentioned in the Introduction, he suggested that 
networks with a large proportion of canalyzing gates tend to be stable with 
high probability. A Boolean function $f(x_1, \ldots, x_m)$ is canalyzing if it is forced 
by some $x \in \{0, 1, *\}^m$ where $x_i \neq *$ for exactly one $i \in \{1, \ldots, m\}$. 
Kauffman's claim seems to be supported by experiments indicating that networks 
constructed from 2-argument Boolean functions usually exhibit stable behavior, 
while those constructed from Boolean functions with more than 2 arguments 
do not. Fourteen out of sixteen 2-argument Boolean functions are canalyzing, 
but this proportion drops rapidly among Boolean functions with more than two 
arguments. However, our analysis does not support the experimental findings. 
To explain these results, we classify the 2-argument Boolean functions into three 
categories. 

I. The two constant functions: 

$f(x_1, x_2) = 0$ and $f(x_1, x_2) = 1$ 

II. The twelve nonconstant canalyzing functions, consisting of 

A. The four functions that depend on one argument: 

$f(x_1, x_2) = x_1$ and $f(x_1, x_2) = \neg x_1$ 
$f(x_1, x_2) = x_2$ and $f(x_1, x_2) = \neg x_2$ 

B. The eight canalyzing functions that depend on both arguments: 

$x_1 \vee x_2$ and $\neg x_1 \wedge \neg x_2$ 

$\neg x_1 \vee x_2$ and $x_1 \wedge \neg x_2$ 

$x_1 \vee \neg x_2$ and $\neg x_1 \wedge x_2$ 

$\neg x_1 \vee \neg x_2$ and $x_1 \wedge x_2$ 

III. The two noncanalyzing functions exclusive or and equivalence: 

$x_1 \oplus x_2$ and $x_1 \equiv x_2$ 
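
The three-way classification can be checked by brute force. The following is a minimal sketch, not part of the original analysis: it enumerates all 16 truth tables on two arguments and applies the definition of canalyzing given above (some single argument, fixed to some value, determines the output). The helper names `is_constant`, `is_canalyzing`, and `classify` are introduced here for illustration only.

```python
from itertools import product

# All inputs (x1, x2) in a fixed order.
INPUTS = list(product((0, 1), repeat=2))


def is_constant(table):
    """True if the function takes the same value on every input."""
    return len(set(table.values())) == 1


def is_canalyzing(table):
    """True if fixing one argument to one value forces the output."""
    for arg in (0, 1):
        for val in (0, 1):
            outputs = {table[x] for x in INPUTS if x[arg] == val}
            if len(outputs) == 1:
                return True
    return False


def classify(table):
    """Return 'I', 'II', or 'III' following the classification in the text."""
    if is_constant(table):
        return "I"
    return "II" if is_canalyzing(table) else "III"


# Enumerate the 16 two-argument Boolean functions as truth tables.
counts = {"I": 0, "II": 0, "III": 0}
for outs in product((0, 1), repeat=4):
    table = dict(zip(INPUTS, outs))
    counts[classify(table)] += 1

print(counts)  # {'I': 2, 'II': 12, 'III': 2}
```

Since constant functions are trivially canalyzing under this definition, types I and II together account for the fourteen canalyzing functions mentioned earlier.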

Note that each function is paired with its negation. Let $a$, $b$, and $c$ be the 
respective sums of the probabilities of the functions of type I, II, and III, i.e., 
$a$ is the probability that a gate is assigned a function of type I, and so on. We 
can now express the $\lambda$ parameter of Section || (see Equation (||)) in terms of $a$, 
$b$, and $c$. Clearly, if $\phi_i$ is of type I, 

$|\{ x \in \{0,1\}^2 : \text{argument 1 directly affects } \phi_i \text{ on input } x \}| = 0.$ 

If $\phi_i$ is of type II.A., say $\phi_i(x_1, x_2) = x_1$, then 

$|\{ x \in \{0,1\}^2 : \text{argument 1 directly affects } \phi_i \text{ on input } x \}| = 4,$ 

whereas if $\phi_i(x_1, x_2) = x_2$, then 

$|\{ x \in \{0,1\}^2 : \text{argument 1 directly affects } \phi_i \text{ on input } x \}| = 0.$ 

If $\phi_i$ is of type II.B., say $\phi_i(x_1, x_2) = x_1 \vee x_2$, then 

$|\{ x \in \{0,1\}^2 : \text{argument 1 directly affects } \phi_i \text{ on input } x \}| = 2.$ 

Altogether, the type II functions contribute $b$ to $\lambda$. Lastly, it is easily seen that 
if $\phi_i$ is a type III function, then 

$|\{ x \in \{0,1\}^2 : \text{argument 1 directly affects } \phi_i \text{ on input } x \}| = 4,$ 

and therefore the type III functions contribute $2c$ to $\lambda$, giving 

$\lambda = b + 2c.$ 
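
The counts above can be reproduced mechanically. The sketch below makes two assumptions that this section does not spell out: an argument "directly affects" a function on input $x$ when flipping that argument changes the output, and each function's weight in the parameter is its counts summed over both arguments and divided by the $2^2$ inputs. That normalisation is chosen only because it reproduces the stated contributions (1 per type II function, 2 per type III function). The names `affect_count` and `lambda_weight` are hypothetical.

```python
from fractions import Fraction
from itertools import product

INPUTS = list(product((0, 1), repeat=2))


def affect_count(f, arg):
    """Number of inputs x on which flipping argument `arg` changes f(x)."""
    count = 0
    for x in INPUTS:
        flipped = list(x)
        flipped[arg] ^= 1
        if f(*x) != f(*flipped):
            count += 1
    return count


def lambda_weight(f):
    """Assumed per-function weight: counts over both arguments / 2**2."""
    total = sum(affect_count(f, j) for j in range(2))
    return Fraction(total, len(INPUTS))


# One representative of each class in the text.
examples = {
    "constant 0 (type I)": lambda x1, x2: 0,
    "x1 (type II.A)": lambda x1, x2: x1,
    "x1 or x2 (type II.B)": lambda x1, x2: x1 | x2,
    "x1 xor x2 (type III)": lambda x1, x2: x1 ^ x2,
}

for name, f in examples.items():
    print(name, affect_count(f, 0), affect_count(f, 1), lambda_weight(f))
```

Under these assumptions every type II function has weight 1 and every type III function has weight 2, so weighting by the class probabilities gives $b + 2c$.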

To analyze the fixed gates, note that $G(z)$ (see Equations (^) and (||)) is a 
weighted sum of the 16 terms $2P_i^0(z/2, z/2)$ corresponding to the 2-argument 
Boolean functions. This sum can be simplified by using the above classification 
and pairing of these functions. 

If $\phi_i$ is the constant function $\phi_i(x_1, x_2) = 0$, then $P_i^0(z/2, z/2) = 1$, but if 
it is the constant function $\phi_i(x_1, x_2) = 1$, then $P_i^0(z/2, z/2) = 0$. Therefore the 
type I functions contribute the term $a$ to $G(z)$. 

If $\phi_i$ is a type II.A. function, say $\phi_i(x_1, x_2) = x_1$, then $P_i^0(z/2, z/2) = z/2$. 
If $\phi_i(x_1, x_2) = \neg x_1$, then $P_i^0(z/2, z/2) = z/2$ again. If $\phi_i(x_1, x_2)$ is a type 
II.B. function, say $x_1 \vee x_2$, then $P_i^0(z/2, z/2) = z^2/4$. If it is $\neg x_1 \wedge \neg x_2$, then 
$P_i^0(z/2, z/2) = z - z^2/4$. Altogether the type II functions contribute the term 
$bz$ to $G(z)$. 

It is easily seen that the two noncanalyzing functions each have $P_i^0(z/2, z/2) = z^2/2$, 
and therefore $G(z) = a + bz + cz^2$. The roots of the equation 

$z = a + bz + cz^2$ (6) 

are 1 and $a/c$. Since $G(z)$ is positive and increasing on $[0, 1]$, the smaller of the 
two roots is also $\lim_{t \to \infty} G_t(0)$. Therefore by Theorem ||, the probability that 
a gate is forced in $\alpha \log n$ steps is asymptotic to $\min(1, a/c)$. 
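
The quadratic form of $G(z)$ can be checked numerically. The sketch below rests on an interpretation that is assumed rather than quoted from the earlier sections: each input is independently forced to 0 with probability $z/2$, forced to 1 with probability $z/2$, and left unforced otherwise, and a gate is forced to 0 when its output is 0 for every completion of the unforced inputs. It uses the uniform distribution on the 16 functions purely as a test case; `p_forced_to_zero` and `G_uniform` are illustrative names.

```python
from fractions import Fraction
from itertools import product

STATES = (0, 1, None)  # None marks an input that is not forced


def completions(state):
    """Values an input may take: both if unforced, otherwise the forced one."""
    return (0, 1) if state is None else (state,)


def p_forced_to_zero(table, z):
    """Probability that the gate's output is forced to 0 (assumed model)."""
    weight = {0: z / 2, 1: z / 2, None: 1 - z}
    total = Fraction(0)
    for s1, s2 in product(STATES, repeat=2):
        outputs = {table[(x1, x2)]
                   for x1 in completions(s1)
                   for x2 in completions(s2)}
        if outputs == {0}:
            total += weight[s1] * weight[s2]
    return total


def G_uniform(z):
    """Weighted sum of the 16 terms, each function having probability 1/16."""
    result = Fraction(0)
    for outs in product((0, 1), repeat=4):
        table = dict(zip(product((0, 1), repeat=2), outs))
        result += Fraction(1, 16) * 2 * p_forced_to_zero(table, z)
    return result


a, b, c = Fraction(1, 8), Fraction(3, 4), Fraction(1, 8)
for z in (Fraction(0), Fraction(1, 3), Fraction(1, 2), Fraction(1)):
    print(z, G_uniform(z), a + b * z + c * z * z)
```

For the uniform case the two printed columns agree exactly, which is consistent with $G(z) = a + bz + cz^2$ at $a = c = 1/8$, $b = 3/4$.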

In summary, for almost all Boolean networks, almost all gates are $\alpha \log n$-weak 
if and only if $b + 2c \le 1$, and almost all gates are forced in $\alpha \log n$ steps 
if and only if $a/c \ge 1$. Since $a + b + c = 1$, $b + 2c \le 1$ is equivalent to $c \le a$. 
Therefore both types of ordered behavior hold if and only if $a \ge c$.^ 

Kauffman performed extensive simulations on two classes of random net- 
works constructed from 2-argument Boolean functions. In the first class, all 16 
of these functions were equally likely to be assigned to a gate. In the second, 
no constant functions were used, and the remaining 14 functions were equally 
likely. In the first case, $a = 1/8$, $b = 3/4$, and $c = 1/8$, giving $\lambda = 1$ and $g = 1$ 
as the only solution to Equation (6). Therefore in this case, almost all gates 
are weak and stable in $\alpha \log n$ steps. But in the second case, $a = 0$, $b = 6/7$, 
and $c = 1/7$, giving $\lambda = 8/7$ and $g = 0$ as the smaller root of (6). Thus in 
this case, a nontrivial fraction of the gates are $\alpha \log n$-strong and not forced in 
$\alpha \log n$ steps. 
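
The parameters for the two ensembles can be recomputed directly from the class sizes. This is a minimal sketch under stated assumptions: every function in an ensemble is equally likely, $g$ is taken to be the smaller root $\min(1, a/c)$ of Equation (6) with $c > 0$, and $G_t$ is read as the $t$-fold iterate of $G$ started at 0. The function names are introduced for illustration.

```python
from fractions import Fraction


def class_probabilities(n_const, n_canal, n_noncanal):
    """(a, b, c) when all listed functions are equally likely."""
    total = n_const + n_canal + n_noncanal
    return (Fraction(n_const, total),
            Fraction(n_canal, total),
            Fraction(n_noncanal, total))


def order_parameters(a, b, c):
    """Return (b + 2c, min(1, a/c)); this sketch assumes c > 0."""
    if c == 0:
        raise ValueError("this sketch assumes c > 0")
    return b + 2 * c, min(Fraction(1), a / c)


def iterate_G(a, b, c, steps=5000):
    """Iterate z -> a + b z + c z^2 from z = 0 (assumed meaning of G_t(0))."""
    a, b, c = float(a), float(b), float(c)
    z = 0.0
    for _ in range(steps):
        z = a + b * z + c * z * z
    return z


uniform16 = class_probabilities(2, 12, 2)      # all 16 functions
no_constants = class_probabilities(0, 12, 2)   # the 14 nonconstant functions

print(order_parameters(*uniform16), iterate_G(*uniform16))
print(order_parameters(*no_constants), iterate_G(*no_constants))
```

In the first ensemble the iterates approach 1 only slowly, because $a = c$ makes 1 a double root of Equation (6); in the second they stay at 0.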



8 Conclusions and Open Problems 

Our analysis for the case $a \ge c$ supports the experimental results for networks 
of 2-input gates when all 16 2-argument functions are equally likely. In fact, it 
gives stronger results than the conclusions of the experiments in three senses. 
Kauffman's notion of weakness requires only that the network should eventually 
return to the same limit cycle after a perturbation, but we have shown that 
with high probability, within $\alpha \log n$ steps, the network will return to exactly 
the same state it would be in without the perturbation. Also, as mentioned 
earlier, forcing is a stronger condition than stability. Lastly, the experiments 
indicated that almost all gates were weak and stabilized for almost all inputs, 
while we have shown that almost all gates are weak and forced for all inputs. 

On the other hand, there is a qualitative difference in the behavior of random 
Boolean networks when $a < c$, and networks constructed from only the 14 
nonconstant 2-argument functions belong to this category. However, this does 
not necessarily contradict Kauffman's claim that these networks also display 
ordered behavior, since he stated only that, when perturbed, they eventually 
return to the same limit cycle, and eventually almost all gates stabilize. It is 
possible that the effects of a perturbation vanish after $\alpha \log n$ steps, and most 
gates stabilize after $\alpha \log n$ steps. Thus one open problem is to determine the 
long-term behavior of nets where $a < c$ (or more generally, when $\lambda > 1$ or 
$g < 1$), to see if the analysis agrees with the simulations. 

^ Articles [11] and [12] contain proofs that $a > c$ implies these kinds of ordered behavior; 
it was conjectured in [12] that they fail when $a < c$. 

We have not addressed the third of Kauffman's notions of order, the size of 
the limit cycle, which Kauffman claims is of the order $\sqrt{n}$ for 2-input networks. 
It has been shown that when $a > c$, not only is the average size of the limit 
cycle $O(\sqrt{n})$, it is bounded by a constant with probability asymptotic to 1 [11]. 
However, when $a = c$, the average size of the state cycle is superpolynomial in $n$ 
[12]. To our knowledge, this is the only analytic result that directly contradicts 
any of Kauffman's claims. The size of the limit cycle is not known when $a < c$. 
We conjecture that it is superpolynomial in this case also. More generally, it 
would be interesting to know if the size of the limit cycle is determined by the 
$\lambda$ or $g$ parameters. 

We have shown that one condition, $a \ge c$, implies both a large number of 
weak gates and a large number of forced gates in networks of 2-input gates. In 
the general case, two different conditions were used to characterize these forms 
of order: $\lambda \le 1$ for weak gates, and $g = 1$ for forced gates. Is there a single 
algebraic condition that characterizes both kinds of order? 

Other questions pertain to the effect of increasing the indegree of gates. If 
we consider networks where each gate has K inputs (using the uniform distri- 
bution), then as mentioned in the Introduction, the simulations indicate that 
when K = 2, ordered behavior is very likely, but when K > 2, the networks 
tend to be disordered. We have described the results for K = 2 above. A simi- 
lar analysis for K > 2 remains to be done. Using a different model of random 
Boolean network, B. Derrida and Y. Pomeau Q have provided evidence sup- 
porting the simulations. In their version, at each step, each gate is randomly 
re-assigned its Boolean function and its inputs. They referred to their model 
as the "annealed" version and Kauffman's as the "quenched" version. They 
showed that, given any two arbitrary initial states, as the two systems evolved 
over time, their Hamming distance (the number of gates on which they differ) is 
approximated by $c_K n$ for some constant $c_K$ that depends on $K$. When $K = 2$, 
$c_K = 0$, but when $K > 2$, $c_K > 0$. Of course, when $K = 2$, the quenched model 
behaves in this way because almost all of the gates are forced. But it is not 
known whether it holds for quenched models when $K > 2$, and the relationship 
between the annealed and quenched models is not well understood. 
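
The dependence of $c_K$ on $K$ can be illustrated with the recursion commonly used for the annealed approximation. The map below, $d \mapsto (1 - (1 - d)^K)/2$ for the normalised Hamming distance $d$ under unbiased random functions, is not stated in this article; it is supplied here as an assumption, and the function name `annealed_distance` is hypothetical. The limiting value of $d$ plays the role of $c_K$.

```python
def annealed_distance(K, d0=0.5, steps=10000):
    """Iterate the assumed annealed map for the normalised Hamming distance.

    A gate's output can differ between the two systems only if at least one
    of its K inputs differs, which happens with probability 1 - (1 - d)**K;
    a freshly drawn unbiased function then gives differing outputs with
    probability 1/2.
    """
    d = d0
    for _ in range(steps):
        d = (1 - (1 - d) ** K) / 2
    return d


for K in (2, 3, 4):
    print(K, annealed_distance(K))
```

Under this assumed map the distance decays to 0 for $K = 2$ (slowly, roughly like $2/t$), while for $K = 3$ it settles at the positive fixed point $(3 - \sqrt{5})/2 \approx 0.382$, matching the qualitative statement that $c_K = 0$ when $K = 2$ and $c_K > 0$ when $K > 2$.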

Lastly, there is a network model that has some of the properties of both 
the annealed and quenched models. Here, the gates and their connections are 
fixed as in the quenched model, but at each step, a random collection of gates 
updates their states. In other words, the gates operate asynchronously. As with 
the annealed model, an asynchronous network need not enter a limit cycle, but 
the other notions of order are still meaningful, and perhaps they can be studied 
productively. 